A Freely Available Morphological Analyzer for Turkish
Abstract
This paper presents TRmorph, a fairly complete and accurate two-level morphological analyzer for Turkish. However, the strength of TRmorph lies neither in its performance nor in its novelty; its main feature is its availability. It has been implemented entirely with freely available tools and resources, and the two-level description is distributed under a license that allows others to use and modify it freely for different applications. To our knowledge, TRmorph is the first freely available morphological analyzer for Turkish. This makes it particularly suitable for applications in which the analyzer has to be modified, or as a starting point for morphological analyzers for similar languages. TRmorph’s specification of Turkish morphology is relatively complete, and it is distributed with a large lexicon. Along with a description of how the analyzer is implemented, this paper provides an evaluation of the analyzer on two large corpora.
Similar resources
A set of open source tools for Turkish natural language processing
This paper introduces a set of freely available, open-source tools for Turkish that are built around TRmorph, a morphological analyzer introduced earlier in Çöltekin (2010a). The article first provides an update on the analyzer, which includes a complete rewrite using a different finite-state description language and tool set as well as major tagset changes to comply better with the state-of-th...
Turkish word segmentation using morphological analyzer
This paper describes an algorithm that uses a morphological analyzer to segment an input Turkish string without any spaces, such as the output of a speech-to-text application, into words. The algorithm can also be applied to other languages that have a morphological analysis component. Turkish morphological analyzer is designed and implemented as the linguistic engine of the ...
A Morphological Analyzer for Turkish Using Combinator Parsing
This paper discusses the implementation of a morphological analyzer for Turkish using the combinator parsing method. A functional language is used for the implementation. The analyzer handles all the morphophonemic processes of Turkish and models the morphotactics as higher-order functions. The code of the analyzer is informative from a linguistic point of view because of the declarative power o...
Rapid Development of Morphological Analyzers for Typologically Diverse Languages
The Low Resource Language research conducted under DARPA’s Broad Operational Language Translation (BOLT) program required the rapid creation of text corpora of typologically diverse languages (Turkish, Hausa, and Uzbek) which were annotated with morphological information, along with other types of annotation. Since the output of morphological analyzers is a significant aid to morphological anno...
An Affix Stripping Morphological Analyzer for Turkish
This paper presents the design and implementation of a morphological analyzer for Turkish. A new methodology is proposed for analyzing Turkish words with an affix stripping approach and without using any lexicon. The rule-based and agglutinative structure of the language allows Turkish to be modeled with finite state machines (FSMs). In contrast to previous work, in this st...